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Abstract We present a new method for the reconstruction of Sunyaev-Zel'dovich 
(SZ) galaxy clusters in future SZ-survey experiments using multiband bolometer 
cameras such as Olimpo, APEX, or Planck. Our goal is to optimise SZ-Cluster 
extraction from our observed noisy maps. We wish to emphasize that none of the 
algorithms used in the detection chain is tuned on prior knowledge on the SZ - 
Cluster signal, or other astrophysical sources (Optical Spectrum, Noise Covariance 
Matrix, or covariance of SZ Cluster wavelet coefficients). First, a blind separation 
of the different astrophysical components which contribute to the observations is 
conducted using an Independent Component Analysis (ICA) method. This is a 
new application of ICA to multichannel astrophysical data analysis. Then, a recent 
non linear filtering technique in the wavelet domain, based on multiscale entropy 
and the False Discovery Rate (FDR) method, is used to detect and reconstruct 
the galaxy clusters. Finally, we use the Source Extractor software to identify the 
detected clusters. The proposed method was applied on realistic simulations of 
observations that we produced as mixtures of synthetic maps of the four brightest 
light sources in the range 143 GHz to 600 GHz namely the Sunyaev-Zel'dovich 
effect, the Cosmic Microwave Background (CMB) anisotropies, the extragalactic 
InfraRed point sources and the Galactic Dust Emission. We also implemented 
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a simple model of optics and noise to account for instrumental effects, assum- 
ing nominal performance for the near future SZ-survey Olimpo, our detection 
chain recovers 25% of the cluster of mass larger than 10 14 Mq, with 90% pu- 
rit y. Our results a r e com pared with those obtained with the published algorithms 



of 



Pierpaoli et al 



I 2005J) . As for global detec tion efficiency, this ne w method is 



Pierpaoli et al 



l 20051) in the high- 



impressive as it provides comparable results to 
purity/low completeness region, being however a blind algorithm: (ie without using 
any prior on data to be extracted). 



Key words. Cosmology : Sunyaev-Zel'dovich effect, Methods : Blind source separation, 
Source extraction, Data Analysis 



1. Introduction 



In the last years, spectacular advances in cosmology happened throu gh observation of 



the sk y at millimete r and sub-millimeter w avelengths. T he Boomerang IINetterfield et al 



2002), MAXIMA (Hananv et al 



Hins haw et al.ll2003(K CBI (Contald i et al 



2000), Archeops ( Tristram^et^al 



20oi 



WMAP 



2002) and DASY l Halverson et alJl2002 ) ex- 



periments measured the anisotropies of Cosmic Microwave Background with high preci- 



sion. Combined w ith other observati ons, such as distant supernovae of IIRiess et al.lll999 



Knop et al 



200j) and/or Cl usters llWhite et 



Scale Structure Observation IjEisenstein et al 



al 



1999: 



5 » 



Pierpaoli et al 



l pr 



2001) or Large 



2005), the former experiments allowed to 



place tight constraints on parameters of generic cosmological models. These results gave 
rise to new questions. Consensus cosmological models assume that the large scale evolu- 
tion of the Universe is driven by Dark Matter and Dark Energy. Dark Matter has been 
sought for 30 years, and most candidates have been rejected in light of experimental 
results. Dark Energy will be one of the toughest challenges of modern cosmology. One 
way to learn about the nature of dark energy, is to study the evolution of the Universe at 
late times during which dark energy is thought to have impacted the formation of large 
scale structures. Cluster surveys are an ideal tool for this purpose. 
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1.1. Cluster detection in SZ survey 

Large scale structures can be studied u sing several probe s: optical galaxy catalogs, 



X-Ray clus ter surveys such as XMM-LSS (IPierre et al 



as Olimpo IjMasi et al 



2004J) , SZ cluster surveys such 



2001), AMIBA IIAMIBA 



2005-2007) 



2003), APEX, AMI ijJone 
and weak shear such as CFHT-LS "wide" . The mass and redshift distribution of clusters 
of galaxies: dN/dzdMdil (where M, z and fl are the cluster mass and redshift, and the 
solid angle respectively), is predicted with fair precision using Halo models and N-body 
simulations. SZ cluster, X-Ray and Optical surveys measure quantities related to cluster 
mass through complex phenomena, such as interstellar gas heating processes, clustering, 
processes of stellar formation and evolution, that are still not fully understood. The SZ 
signal from galaxy clusters is believed to be the simplest direct observable of cluster mass, 
though some uncertainties on extragalactic gas heating mechanisms remain. Measuring 
the cluster mass function will require these systematics to be understood. Pioneering SZ 
cluster surveys will take their first data this year. The Planck satellite survey will follow 



shortly l Lamarre et alJ l2004'). In this paper we prepare for their analysis, using simulated 

data. 

2004) following Herranz et al. I 2002bla ) developped a cluster ex- 



JB. Melin (Mclin 



traction tool based on an optimised filter. The filter is optimised to damp the noise as 
well as the other astrophysical components. SZ signal is small compared to the other 
astrophysical sources, rendering the task of filtering difficult. In addition a filter damps 
some of the signal at the same time as it rejects parasitics. We then developped the 
idea of a 3 steps method using as a first step a source separation algorithm. The source 
separation algorithm sorts the noisy SZ signal from the other astrophysical components 
and the filtering step is left with the easier task of improving SZ cluster signal versus 
noise. Finally a detection algorithm extracts the SZ cluster candidates. 

In the 100 to 600 GHz range, the brightest components of the sky are the Cosmic 
Microwave Background (CMB), the Infrared Point Sources, the Galactic dust emission 
and, swamped in the previous ones, the SZ clusters. It follows that the true sky map 
J„(t?, ip), in a given optic band centered on is, can be modeled as a sum of distinct 
astrophysical radiations as in 



X v (0, tp) = CMB U {$, ip) + IR,,(0, ip) + Gal v {-d, <p) + SZ„{d, tp) 



(1) 
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where ip denote spatial or angular indexes on 2D or spherical maps. The observed 
sky map is the result of the convolution of the true sky with the optical beam of the 
telescope plus the unavoidable contribution of instrumental noise N u ($,(p). Multiband 
signal processing in astrophysical applications deals with exploiting correlations between 
data maps at different v, to enhance the estimation of specific objects of interest such 
as the power spectrum of the CMB spatial fluctuations, or the mass function of galaxy 
clusters. Considering the case where the radiative properties of the sources are completely 
isotropic in the sense that they do not depend on the direction of observation, the above 
model can be rewritten in the following factored form: 

X u {d,<p)=Y j au,iSi{V,v) + N u ifi,<p) (2) 

i 

where Si is the spatial template and a v ^ the emission law of the % th astrophysical com- 
ponent. Although this is a coarse approximation in the case of Infrared Point Sources as 
described in section it is mostly valid for the other three components. With obser- 
vations available in m channels, assuming the beam varies only slightly as a function of 
is, equation J2J) can be written in matrix form : 

X(f,<p)=AS(0,(p) + N(#,<p) (3) 

where -X"($, tp) is a vector in R m , A is an m x n matrix, n is the number of contributing 
astrophysical components, ip) is now a vector in R™ and iV($, ip) in R"\ Equation © 
expresses that the observations consist of linear mixtures of astrophysical components 
with different weights and additive noise. Assuming no prior knowledge of A, recovering 
the particular component of interest, the SZ effect map of galaxy clusters, can be seen as 
a blind source separation problem which we approach in this paper using Independent 
Component Analysis (ICA). Such methods have been previously used quite successfully 
in the analysis of astrophy sical data from present or future multichannel experiments 



such as WMAP or Planck ( Delabrouille et al 



2003; 



Maino et al 



2002; 



Kuruoelu et al 



2003|) where the focus is on estimating the map of CMB anisotropics. We concentrate here 
on separating the SZ component which has very different statistical properties compared 
to CMB and thus requires the use of other ICA methods. Then, due to the additive in- 
strumental noise, the source separation process has to be followed by a filtering technique. 



We use an ite rative filtering based on Bayesian methods to filter the noise 



I Starck et al 



2005). This method uses a Multiscale Entropy prior which is only de- 



fined for non-significant wavelet coeffic ients selected by the False Discovery Rate (FDR) 



Method IjBeniamini fc Hochberg 



1995). 



Then, w e identify SZ clusters in filtered maps using the Source Extractor software 



SExtractor l|Bertin 



2003) which turned out to conveniently provide source extraction 



along with adequate photometry and de-blending. 



1.2. Outline of the paper 



In t he following, we first describe our sky model. We use the Olimpo SZ-cluster 



survey l|Masi et al 



2003) as an example of a future ambitious SZ survey. We model 



the Olimpo instrument and produce simulated observed maps. With minor parameter 
changes, these algorithms will also allow us to simulate observed local maps typical 
of other bolometer camera SZ surveys (i.e Planck or APEX observations). Then we 
explain the data processing methods used to recover the map of SZ-cluster signal and 
the "observed" cluster catalog. We quantify the efficiency of our detection chain and 
compare it to a recently published method, thanks to cooperation of the authors. This 
paper is at the interface of astrophysics and applied mathematics. We tried to write an 
article understandable for both communities. 



2. Simulations of contributing astrophysical components 

The input of our simulation is a list of cosmological model parameter values. We use 
typically a KCDM cosmological model with il t = 1, fi m = 0.3, = 0.04, Qdm = 0.26, 
SIa = 0.7, h = 0.7, eg = 0.85, n s = 1, r = 0.1666 and a seed for the random number 
generator. This simulation was fully written in C++, so that we can mass produce our 
simulations on a PC farm, when needed. 



2.1. Cosmic Microwave Background anisotropies 

Big Bang cosmology models assume that the Universe has been evolving from a hot 
and dense plasma. While it expands, its contents is cooling down. At some time, so called 
decoupling, when the temperature of the Universe lowered below hv = ksT ss 0.2 eV, the 
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mean free path of photons suddenly became infinite so that a large majority of them have 



been propagating freely through the Universe ever since. These photons are no 



waday s 



observed having a black-body spectrum at temperature 2.728K (Hin shaw et al 



2003): 



the Cosmic Microwave Background (CMB). Small anisotropies in the CMB are observed 
and interpreted as resulting from temperature and density fluctuations in the primordial 
plasma at decoupling time. 

Bolometer cameras measure incoming power variations, while scaning the sky. Thus the 
main blackbody spectrum (monopole) won't be observed. In the following simulations 
we also assume that the CMB dipolc (Doppler effect due to Earth mouvement) has been 
subtracted from the data. 



In order to simulate a map of CMB anisotropies, we first enter the cosmological param- 



eter list in the cmbeasy code i Doran fc Mueller 



2004^1 and compute a Power Spectrum 



in Spherical Harmonics up to multipole 5000, of the CMB fluctuations. Then using a 



small field approximation I White et alJll99fl ). we generate a random spatially correlated 



Gaussian field of primordial anisotropies with the previous power spectrum. Figure ^ 
shows a typical simulated CMB anisotropy map in units of j-iK. 



Figure 1. The 4 physical components of the sky included in our simulation: upper right 
is a map of the CMB's anisotropies in unit of /iK, upper left is the SZ Cluster map, in 
unit of y Compton, lower right is the IR point source map, convolved with a beam of 2 
arcmin, in Jy at 350GHz, finally lower left is the Galactic dust map in unit of MJy/st 
at 100 /zm. 



2.2. IR point sources 

In our frequency range, infrared point sources are expected to be a significant con- 
tamination. Borys et al. from the SCUBA experiment published a list of infrared point 
sources at 350GHz and a logN/LogS law from this data. We extend this law to lower 
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brightness, and generate a catalog of IR point sources in our field of view. We assume 
that their optical spectra follow a grey body law parameterised as: 



F(v) = v a * Planck(v, T ). (4) 

where Planck stands for Planck's black body spectrum, v stands for frequency, To is the 
temperature of the black body and a is by definition the spectral index. For each IR 
point source, the spectral index was randomly picked between 1.5 and 2, and To was set 
to 30 K. Thus each IR point source has a different optical spectrum. Then the IR point 
source is placed on our map as a single pixel. IR sources are positioned randomly on the 
map but a user chosen fraction of them can be positioned within clusters of galaxies. We 
get an IR point source map in units of Jy at 350 GHz as shown on Figure Q ■ 

2.3. Galactic Dust Emission 

We know since FIRAS and IRAS that the galactic interstellar dust emits in the far in- 
frared. We wanted for these simulations to generate truly random galactic dust maps. So, 
instea d of using the maps published by Schlegel et al. , we followed Bouchet and Gispert 



1 1999) and assumed that Galactic Dust was described by a power spectrum of spatial 
correlations in C'i oc I /I 3 . We then generated randomly a map with the same algorithms 
as used for the CMB map simulations. We assume a homogeneous optical spectrum over 
our map given by equation0]with a =2, and Tq= 20K. Finally, we normalise the map rms 
at its observed values at high galactic latitude: at A =100 fira, the rms flux is observed 
at lMJ/st. We get a galactic Dust Emission Map, shown on figure^m natural units of 
MJ/st at 100 um. 

2.4. Sunyaev-Zel'dovich Clusters 

Hot intracluster gas is observed as a plasma at temperature 2 to 20 keV. A small frac- 
tion of CMB photons traveling through clusters Compton-scatter on electron gas. Since 
the electron kinetic energy is much larger than the CMB photon energy, CMB pho- 
tons statistically gain e nergy when diffused. This effect, known as the thermal Sunyaev- 



Zel'dovich effect |l980j), results in a small distortion in the CMB Black-Body spectrum, 
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in the direction of Clusters of galaxies. The probability that a photon scatters while 
crossing a cluster is given by: 



kT e ka T 
-n e GTdi = 



2 I ^e'^e 1 - 
los 



(5) 



los m e c m e C 

The optical spectral dependance is given by: 

(fcT ) 3 / x \ 

J{ ' (h pC y (e* -l) 2 \tanh(x/2)-4J {> 

In these expressions, T e ,n e ,m e refer to the electron temperature, density and mass, 
respectively; ctt = 6.65 x 10 _25 cm 2 is the Thomson cross-section and x = ku/kTcMB is 
the dimensionless frequency of observation in terms of unperturbed CMB temperature 
T= 2.728K . 



Simulation of SZ Cluster map requires modelling large sca le structure formation. 
For this we translated in C++ the ICosmo semi-analytic model iRefregier fc R. Massev 



2004|) which, starting from a cosmological parameter set, allows us to compute the cluster 
density versus mass and redshift: dn/dzdMdVL(M,z). Several fits to data and physical 
assumptions are involved and implemented in this computation. We run C-ICosmo using 
the options of conc ordance ACDM model and the mass function from Seth and Thormen 



1 Sheth et al 



collapse l|Peeble£ 



2001). T he virial radiu s of a cluster is computed assuming a spherical 



1993 



Peacok 



1999). This is the radius where the average overdensity 



is A c = 1 + 5 = 200 times the critical density (chose n independent of redshift). Gas 



heating is modeled according to IjPiernaoli et al 



2003), with the additional parameter 



T* . Then we randomly generate a cluster catalog in our field of view, according to 
density profile keeping those clusters which have an integrated Compton parameter Y 
larger than 3.10~ 12 st (corresponding to a mass cut of 10 14 Solar Mass). 



An cllipticity is attributed to each cluster according to experimental observations 



I Coorav 



2000). We assume that the cluster gas density follows an elliptical beta model. 



For the time being, clusters are positioned on the map randomly and the orientation of 
the main axis of each ellipse is also chosen randomly. We project to account for two point 
correlations in cluster positions in the near future. A typical SZ cluster map in natural 
unit of y Compton is shown on Figure ^ 
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Figure 2. Simulated maps in Olimpo's four frequency bands, upper right is the 147 
GHz Band, upper left) is the 217 GHz Band, lower right is the 385 GHz Band: CMB 
anisotropies, IR point sources and Galactic Dust blend in this band lower left 500 GHz: 
IR point sources and Galactic Dust are the dominant features at high frequencies. SZ 
cluster signal is dominated by other astrophysical sources at all frequencies. 

3. Modeling observations 

3.1. Instrumental effects 

A detailed model of observations can only be developed after data acquisition. In the 
following, we choose to use a very simple observation model, nevertheless representative of 
an ongoing project, the Olimpo balloon experiment. The Olimpo camera will observe the 
sky at 4 frequencies, 143, 217, 385 and 600 GHz. The millimeter-wave filters' bandwiths 
will be close to 30 GHz and are assumed to be of tophat shape. The beams at the 4 
frequencies are assumed to be symmetric, Gaussian and of 3, 2, 2 and 2 arcmin FHWM 
respectively. The optical efficiency (mirrors, filters), the photon noise and the bolometer 
sensitivities and noise contributions are summarized by an equivalent observed noise 
temperature n eq x on the Sky in unit of /iKcmb/Hz 1 ^ 2 ■ Based on the BOOMERanG 
experiment, typical values in the four channels of Olimpo are expected at 150, 200, 500, 
and 5000 //KQjyjg/Hz^ 1 / 2 respectively. 



3.2. Noise model 

The simplest and most optimistic model is to assume that the observed noise 
along bolometer time lines will be white, in which case, after projection (or coaddition. 



l Yvon fc Mavet 



2005)) on 2D maps, pixel noise can be very easly computed as: 



noispi X = n eq T * y N Bo i * t pix (7) 

where Nb i stands for the number of bolometers working in a particular frequency band, 
and tpi x is the observation length on this pixel. High resolution SZ bolometer surveys 
will cover their large survey adding up small patches of the sky. Noise map patterns are 
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likely to be complex, and will only be known after data has been acquired. We chose in 
this first paper to begin with a simple case of homogeneous white noise on the full field 
of view. 



Frequency [GHz] 


143 


217 


385 


600 


FWHM [arcmin] 


3,5 


2 


2 


2 


White noise level \\iKj\J Hz] 


150 


200 


500 


5000 



3.3. Mixing model 

We now have all the tools to compute simulated observed sky maps. For each fre- 
quency band, we compute a conversion factor from the maps in their natural units into 
the observed unit at the bolometer level namely pW/ m r / 'at. The exception is the IR point 
source map where, because of the random spectral indexes of the point sources, a conver- 
sion factor is computed for each point source. We sum the four physical components into 
a "true" sky map which is then convolved with the experimental beam. Then we add the 
noise. We end up with one map per frequency band (figurc[2Jl . These maps would be what 
the analysis team would recover from the data, after pointing reconstruction, parasitics 
removal, de-correlation of instrumental systematics in the data, and map-making: a big 
analysis work. In the following, we explain a set of algorithms optimised for recovering 
SZ cluster signals in the observed maps. 

Figure 3. SZ component map extracted by JADE from the four observed noisy maps. 
The SZ cluster signal, subdominant at all observed frequencies, now appears clearly. No 
obvious leftovers from other astrophysical sources are seen. Remaining noise is small, 
because we prefiltered data before JADE processing, and we simulated the nominal noise 
levels of an ambitious project: Olimpo. 

4. Separation of Astrophysical components 

The simulated observations generated according to the method described in the pre- 
vious section, consist of linear mixtures of the four main sources of diffuse radiation as 
expected in the four channels of the Olimpo instrument. Focusing on separating the SZ 
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map, we want to make the most of the data in the four frequency bands (centered on 
143, 217, 385 and 600 GHz) and exploit the fact that each map is actually a different 
point of view on the same scene : multichannel data is about processing these maps in a 
coherent manner. Next is a quick overview of the general principles of ICA for component 
separation from multichannel data. Then, we give a more detailed description of JADE, 
the specific ICA algorithm we used. 



4.1. Independent component Analysis 

Blind Source Separation (BSS) is a problem that occurs in multi-dimensional data 
processing. The overall goal is to recover unobserved signals, images or sources S from 
mixtures of these sources X observed typically at the output of an array of m sensors. 
The simplest mixture model takes the form of equation |21 

X = AS + N (8) 

where X and S are random vectors of respective sizes m x 1, n x 1 and A is an m x n 
matrix. The entries of S are assumed to be independent random variables. Multiplying 
S by A linearly mixes the n sources into m observed processes. 

Independent Component Analysis methods were developed to solve the BSS prob- 
lem, i.e. given a batch of T observed samples of X, estimate the mixing matrix A 
and reconstruct the corresponding T samples of the source vector 5 1 , relying mostly on 
the statistical independence of the source processes. Note that with the above model, 
the independent sources can only be recovered up to a multiplication by a non-mixing 
matrix i. e. up to a permutation and a scaling of the entries of S. Although independence 
is a strong assumption, it is in many cases physically plausible. The point is that it 
goes beyond the simple second order decorrelation obtained for instance using Principal 
Component Analysis (PCA) : decorrelation is not enough to recover the source processes 
since any rotation of a white random vector remains a white random vector. 



Algorithms for blind component separation and mixing matrix estimatio n depend 



on th e a priori model used for the probability distrib utions of the sources (Cardoso 



2001 



2001 ) alilsou.nl! rails<T coaix' asssimpt ion^ car, be made i CardosJl flfl .s 



Hvvarinen et al 



). In a first set of techniques, source separation is achieved in a noise-less set- 
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ting, based on the non-Gaussianity of all but possibly one of the co 



mpon ents. Most 



mainstre am ICA techniques be long to this category : JADE l)Cardosolll999[) . FastICA, 
Infomax IjHvvarinen et al In a second set of blind techniques, the components are 



modeled as Gaussian processes and, in a given representation (Fourier, wavelet, etc.), 
separation requires that the sources have diverse, i.e. non prop ortional, variance profiles 



2003: 



For instance, the Sp ectral Matching ICA method (SMICA) l|Delabrouille et al 
Mouddcn et al . 2005), considers in this sense the case of mixed stationary Gaussian com- 



ponents in a noisy context : moving to a Fourier representation, the idea is that colored 
components can be separated based on the diversity of their power spectra. 

In the case where the main component of interest is well modeled by a peaked distri- 
bution with long tails (e.g. Laplace distribution) as is the case with SZ maps, methods 
from the first set are expected to yield better results. Next is a description of JADE, the 
non-gaussian ICA method we used. 

4.2. JADE 

The Joint Approximate Diagonalization of Eigenmatrices method (JADE) assumes a 
linear mixture model as in © where the independent sources S are non-gaussian i.i.d. 1 
random processes with the additional assumption of a high signal to noise ratio (i.e. 
N rs 0).The mixing matrix is assumed to be square and invertible so that (de)mixing is 
actually just a change of basis. Although the noise-less assumption may not be true in 
the problem at hand, the algorithm may still be applied and in fact, a proper change of 
representation can get us closer to such a setting, as discussed in the next section FOl 

As mentioned above, second order statistics do not retain enough information for 
source separation in this context: finding a change of basis in which the data covariance 
matrix is diagonal will not in general enabl e to identify th e independent sources properly. 
Nevertheless, decorrelation is half the job 1 Cardosol 199S ) and one may seek the basis in 



which the data is represented by maximally independent processes among those bases 
in which the data is decorrelated. This leads to so-called orthogonal algorithms: after a 



1 The letters i.i.d. stand for independently and identically distributed meaning that each of 
the entries of X at a given position t are independent of X at any other position i and that the 
distribution of X does not depend on position. 
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proper whitening of the data by multiplication with the inverse of a square root of the 
covariance matrix of the data W , one is then seeking a rotation R (which leaves things 
white) so that S defined by 

S = W- 1 Y = W- X RX whitc = W~ 1 RW X (9) 

and B = A^ 1 = W^ 1 RW are estimations of the sources and of the inverse of the 
mixing matrix. 



JADE is such an orthogonal ICA method and, like most mainstream ICA techniques, 
it exploits higher order statistics so as to achieve some sort of non linear decorrelation. 
Precisely, in the case of JADE, statistical independence is assessed using fourth order 
cross cumulants defined by : 



Fijki = cum(yi,yj,y k ,yi) 

= £(yiVjykyi) - £{yiyj)£{ykyi) - £(y%vi)£{viVk) - £{yiyk)£(yjyk) (10) 

(11) 



where £ stands for statistical expectation and the j/j's are the entries of vector Y modeled 
as random variables. Then, the correct change of basis (i. e. rotation) is found by somehow 
diagonalizing the fourth order cumulant tensor. Indeed, if the y^s were independent, all 
the cumulants with at least two different indices would be zero. As a consequence of the 
independence assumption of the source processes S and of the whiteness of Y for all 
rotations R, the fourth order tensor F is well structured: JADE was precisely devised to 
take advantage of the algebraic properties of F. JADE's objective function is given by 

ij fc/J 



which can be interpreted as a joint diagonalization criterion. Fast and robust algorithms 
are available for the minimization of j7jApp(-R ) with respect to R based on Jacobi's 



in 1 Cardoso 



1999 



1998; 



° 2 



me thod for matrix diagonalizatio n (Pham 2001). More details on JADE can be found 



Hvvarinen et alJ l20011 
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4.3. JADE in wavelet space 

We chose to use JADE after a Wavelet transform. Wavelets come into play as a 
sparsifying 2 transform. Moving the data to a wavelet representation does not affect its 
information content and applying a wavelet transform on both sides of JHJl does not affect 
the mixing matrix and the model structure is preserved. However, the statistical distri- 
bution of the data coefficients in the new representation is different: wavelets are known 
to lead to sparse approximately i.i.d. representations of structured data. Further, the 
local (coefficient wise) signal to noise ratio depends on the choice of a representation. A 
wavelet transform tends to grab the informative coherence between pixels while averaging 
the noise contributions, thus enhancing structures in the data. Although the standard 
ICA model is for a noiseless setting, the derived methods can be applied to real data. 
Performance will depend on the detectability of significant coefficients i.e. on the sparsity 
of the statistical distribution of the coefficients. Moving to a wavelet representation will 
often lead to more robustness to noise. 

Once the data has been transformed to a p roper representation (e.g. wavelets but 



also ridgelets and curvelets Iptarck et al 



2003) should the 2D or 3D data be strongly 



anisotropic), we apply the standard JADE method to the new multichannel coefficients. 
Once the mixing matrix is estimated, the initial source maps are obtained using the 
adequate inverse transform (figure |2J. 



5. Clusters Restoration method - Noise filtering 

JADE has been used to separate the signals from four mixtures and four sources in the 
presence of Gaussian noise. We have added the expected experimental level of Gaussian 
noise at each observed mixture map. In order to prevent bias induced by pixellisation 
along our simulation and detection algorithms, we overpixellised our observed maps, 
and thus pixel noise is enhanced. But all algorithms for BSS that require whitening are 
sensitive to additive noise. In order to minimise the impact of this noise at the ICA 

2 Data is sparse on a basis when this basis allows to describe that signal with a small number 
of coefficients. This is a higly desirable property, since noise is not expected to be sparse at the 
same time on such a basis. Choosing a sparsifying basis thus allows to enhance signal to noise 
ratio. 
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step, we chose to convolve our observed map before JADE with a gaussian of optical 
beams' width. Then in order to optimise signal recovery, we have to perform a filtering 
to remove remaining noise. We tested different methods for denoising. Thanks to the 
original simulated SZ map (without noise), we can easily compare the results of those 
filterings in the next part. 

5.1. Linear Filtering 
5.1.1. Gaussian filter 

A rather common linear filtering technique uses a Gaussian filter, generally isotropic. 
The standard method consists in convolving the observed map S bs with a Gaussian 
window G with standard deviation ctg : 

Sg = G * S b s (12) 

The filtering depends strongly on the value of Sg- This value is adjusted arbitrarly 
or based on a priori information. 



5.1.2. Wiener filter 

An alternative to Gaussian filtering is Wiener filtering which is a method that at- 
tempts to minimize the mean squared error between the original and the restored signal. 
Wiener filtering consists in convolving the observed map S b s with a weight function that 
is to say by assigning the following weight to each mode in Fourier Space: 

m . . jjwE us) 

|S(*)|* + |AW 

where |5(A;)| 2 is a model of the map power spectrum and is in practice derived from 
the data. The weight function makes it possible to attenuate or to remove part of the 
frequencies if the signal-to-noise ratio is low. The filtering depends on the model of the 
noise. The Wiener filter is the optimal filter if both the signal and the noise are well mod- 
eled as Gaussian Random Fields. This condition is not verified for SZ maps which display 
highly non-Gaussian features. Nevertheless, Wiener filtering generally outperforms the 
simple Gaussian filtering. 
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In the following paragraphs, we have tested a state of the art non linear filtering 
method in order to improve signal recovery. 



5.2. Multiscale Entropy method using False Discovery Rate (FDR) 
5.2.1. Maximum Entropy Method (MEM) 

The Maxi mum Entropy Method (MEM) is commonly used in astronomy for im age 
processing (see lstarck et all fcoOljk Marshall et all l|2002|) : Istarck &; Murtaehl feoO^ for 



a full description). It is based on entropy and Bayesian methods. The Bayesian approach 
provides the means to incorporate prior knowledge in data analysis. Choosing the prior 
is one of the most critical aspects of Bayesian analysis. Here an entropic prior is used. In 
data filtering, an entropy functional assesses the information content of a data set. Among 
several possible definitions of e ntrop y, the most commonly used in image processing is 



the Gull and Skilling definition 



1991 



= EE 



S(k,l)-m(k,l)- S(k,l) In 



( S(k,l) 
\m(k, I) 



(14) 



where to is a model, chosen typically to be a sky background. H a has a global maximum 
at S — to. However, MEM does not allow negative values in the solution, which is a 
problem in experimental SZ data analysis, wh ere we measure fluctua tions about zero. To 



overcome this problem, it has been proposed ( Maisiiiee r et al 

' >(M) + S{k,l) 

} </A K J 1 ) ~ zm — 0\K,l) ill I 

k I 



2004) to replace H a by: 



H +/- ( s ) = E E ^< o - 2m - 5 ( fc < o ln 



2to 



(15) 



where "4>{k, I) = y / S 2 (k, I) + Am 2 . Here to does not play the same role as in i|15J) . It is a 
constant fixed to the expected signal rms. 

To overcome the difficultie s encountered by the M EM to restore images containing 



both high and low frequencies, 



Pantin fc Starckl ((1296) have suggested a definition of en- 



tropy in a multiscale framework which we describe in the next section. It has been shown 
that the main drawbacks of the MEM (i.e. model dependent solution, oversmoothing of 
compact objects, . . . ) disappear. 



5.2.2. Multiscale Entropy 

1. Multiscale Entropy definition: 
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The Multiscale Entropy method is based on the standard MEM prior derived from the 
wavelet decomposition of a signal. The idea is to consider the entropy of a signal as 
the sum of the information at each scale of its wavelet transform. And the information 
of a wavelet coefficient is related to the probability of its being due to noise. 
The Undecimated Isotropic Wavelet Transform (UIWT) decomposes an n x n image 
S as in : 

S(k,l) = C J{k4) +Y J w H k.i)' 

3 = 1 

where Cj is a coarse or smooth version of the orig inal image S and Wj represents 
the details in S at scale fsee lStarck fc Murtaghl for details). Thus, the algorithm 
outputs J + 1 sub-band arrays of size n x n. We will use an indexing convention such 
that j = 1 corresponds to the finest scale (high frequencies). 

Denoting H (S) the information relative to the signal and h(wj(k, I)) the information 
relative to a single wavelet coefficient, the entropy is now defined as: 



(16) 



H(s) = h(Cj(k,i)) + Y^ E ftto-(M)) 

j=i k,i=i 

where I is the number of scales and Nj is the size of map in the band (scale) j . 
2. Entropy definition: 

The function h in l|16|l assesses the amount of information carried by a specific wavelet 
coefficient. Several functions have been proposed for h. A discussion and comparison 
between different entropy definitions can be found in 



the NOISE-MSE entropy 



Starcketal 



Starck et al 



(2005). We choose 



I 2001) for the SZ reconstruction problem in 



which the entropy is derived using a model of the noise contained in the data: 



h(wj(k, I)) 



P n (\ Wj (k,l)\-u)( 



dh{x) 



,.du 



(17) 



J '' 1 dx 

where P n (wj(k,l)) is the probability that the coefficient Wj(k,l) can be due to the 
noise: P n (wj(k,l)) — Prob(W^ >| Wj(k,l) |). For Gaussian noise, we have: 
2 



P n (wj(k,l)) 



27T<7i 



exp{-Wy2aj)dW 



j J\ Wj (kd)\ 

erfc(^M) 



a/2ctj 

and equation 1171 becomes 1191 



h(vjj{k, I)) = — 



w ^ , \wj(k,l) | -u 

u erfc — - — = )du 



(18) 



(19) 
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The NOISE-MSE is very close to the l\ norm (i.e. absolute value of the wavelet 
coefficient) when the coefficient value is large , which is known to pro duce good results 



2003). 



for the analysis of piecewise smooth images ( Donoho fc Ela' 
3. Signal and noise information: 

The SZ component obtained by blind separation is swamped by noise. The following 
algorithm assumes that the observed map can be decomposed as: 

S obs = S + N (20) 

Then, we can decompose the information contained in our image in two components, 
the first one (H s ) corresponding to the non corrupted part, and the other one (H n ) 
describing a component which contains no information for us: 

H(S obs (k,l))=H s (S obs (k,l))+H n (S obs (k,l)) (21) 

For each wavelet coefficient Wj(k,l), we have to estimate the fractions h n and h s of 
h: 

l N 3 i Nj 

H(S obs (k,l))=J2Yl h s ( Wj (k,l)) + J2 E ^hK(M)) (22) 

j = l k,l=l j=l k,l=l 

(23) 



5.2.3. Multiscale Entropy Filtering 

1. Filtering: 

The problem of filtering the observed map S bs can be expressed as follows. We look 
for a filtered map Sf such that the difference between Sf and S obs minimizes the 
information due to the signal (to recover all the signal) and such that Sf minimizes 
the information due to the noise (we want no noise). These two requirements are 
somehow competing. A tradeoff is necessary, because, on one hand, we want to remove 
all the noise (heavy filtering) and on the other hand, we want to recover the signal 
with fidelity. In practice, we minimize for each wavelet coefficient Wj(k,l) : 

l(w 3 (fc, 0) = K ( Wj (k, l) ~ w 3 (fc, 0) + /3.h n (wj (k, I)) (24) 

where Wj(k,l) are the wavelet coefficients of the observed map S obs , Wj(k,l) the 
wavelet coefficients of the filtered map Sf and (3 is the so called regularization 
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(trade-off) parameter. 



2. Selecting significant Wavelet coefficients: 

Whatever the filtering, the signal is always substantially modified. We want to 
fully reconstruct significant structures, without imposing strong regularization while 
eliminating effi ciently the noise. The introduction of the multiresolution support 
i Murtagh et alJl995 ). helps to do so. The idea is to apply the previ ous regularization 
(i.e. fi ltering) only on the non-significant (noisy) wavelet coefficients (|Pantin fc Starck 



1996). The other components of the maps are left untouched. The new Multiscale 
Entropy becomes: 

h(wj(k,l)) = M(j,k,l)h{wj{k,l)) 



(25) 



M(j,k,l) 



(26) 



where M(j, k, I) = 1 — M(j, k, I), and M is the "multiresolution support" defined as 
1 if Wj (k, I) is significant 
if Wj (k, I) is not significant 
M describes, in a Boolean way, whether the data contains information at a given 
scale j and at a given position (fc, I). Wj(k, I) is said to be significant if the probability 
that the wavelet coefficient is due to noise is small. In the case of Gaussian noise, a 
coefficient Wj(k,l) is significant if | Wj(k,l) \> kaj, where <jj is the noise standard 
deviation at scale j, and A: is a constant. Without an objective method for selecting the 



thres hold, it is adjusted arbitrarly, generally taken between 3 and 5 IjMurtaeh et al 



1995). 



3. Selecting significant Wavelet coefficients using the FDR: 

The False Discovery Rate (FDR) is a new statistical procedure due to 



I Beniamini fc Hochbere 



1995) which offers an effective way to select an adaptative 



threshold to c ompute the multiresolution support . This technique has recently been 



described by (M iller et al.l l2001 



Hop kins et al 



2002; 



Starck et al 



2005) with several 



examples of astrophysical applications. The FDR procedure provides the means to 
adaptively control the fraction of false discoveries over total discoveries. The FDR 
is given by the ratio l|27|l . that is, the proportion of declared active which are false 
positives: 

Via 

D n 



TVTl 



(27) 
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where Vi a is the number of pixels truly inactive declared active, and D a is the num- 
ber of pixels declared active. The FDR formalism ensures that, on average, the False 
Discovery Rate is no larger than a which lies between and 1. This procedure guar- 
antees control over the FDR in the sense that: 



£{FVTl) < y.a < a 



(28) 



The unknown factor ^& is the proportion of truly inactive pixe ls. A complete de- 
script ion o f the FDR method c an be found in 



(2002) and 



Starck et al 



Miller et al 



(2003|). In 



Hop kins et al 



I 2005), it has been shown that the FDR outperforms stan- 



dard method for source detecti on. In this applicat ion, we use the FDR method in a 



multircsolution framework (sec 



Starck et al 



2005). We select a detection threshold 



Tj for each scale. A wavelet coefficient Wj(k, I) is considered significant if its absolute 
value is larger than Tj as seen below. 
4. Multiscale Entropy Filtering algorithm: 

Assuming Gaussian noise, the Multiscale Entropy restoration method reduces to find- 
ing the image Sf that minimizes J(5/), given the map S b s output of source separation 
with: 

12 J 



J(Sf) 



Sob 



s f 



2<j?„ 



/^EEm^/)^) 

3=1 k,l 



(29) 



where a n is the noise standard deviation in S b s , J is the number of Wavelet scales, W 
is the Wavelet Transform operator and h n (wj t k,i) is the multiscale entropy only defined 
for non significant coefficient (outside de Support selected b y the FDR th r eshold ing). 



Starck et al 



(2001) The 



Full details of the minimization algorithm can be found in 
results presented in the next section, are obtained on a SZ map with a uniform 
Gaussian white noise but the method still holds for a non uniform Gaussian noise 
over the map. 



6. Clusters detection 

6.1. Extraction algorithm - S Extractor 

We use a public sour ce software f or extraction of SZ-Cluster candidates from the fil- 



tered maps: SExtractor (|Berti SExtractor turned out to be fast, convenient, and 
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easy to configure. It handles conveniently Cluster Extraction, with observed sizes ranging 
from a few pixels up to angular diameters of a degree. We used the Noiseless configura- 
tion, assuming that our filters are efficient enough for this assumption to be valid. We 
mainly use sources identification capability and the deblending algorithms that turns out 
to be very usefull when extracting large-mass clusters. In order to recover as many candi- 
dates as possible, we set DEBLENDJVIINCONT = and DETECT.MINAREA = 5 (the 
smallest cluster size after optical beam dilution is over 9 pixels). Photometry of Cluster 
candidates is done using SExtractor's FLUX_AUTO Mode. SExtractor photometry per- 
forms very well on our data: the recovery error on integrated Compton flux Y is smaller 
than 4 percent, for Y larger than 2.5 10~ 10 sr (id 3 arcmin 2 ). Thus SExtractor out- 
puts a catalog of Cluster candidates with their position, flux and size. This catalog is likely 
to be contaminated by false detections due to residual noise, or spurious point sources 
that have survived the Source Separation step. While processing observed data, we will 
have to live with this contamination. Using simulations, we quantify in the following the 
SZ-Cluster selection function (Completeness) and this contamination. 

6.2. Completeness and Purity of the recovered catalog 

In order to identify in our catalog true SZ Clusters from contamination we used an 
association criterion: for each candidate cluster, we scan the generated cluster catalog 
(see 12 .4|) for the closest cluster and store the angular distance between the two. Doing 
so, we observed a distribution with a strong peak at a distance smaller than 4 pixels (3.5 
arcmin), corresponding to the true detected clusters, and a long flat tail corresponding 
to random associations. We then decided to tag candidates as true when their distance 
to the closest cluster in the simulated catalog is less than 4 pixels, neglecting the small 
fraction of random association passing this test. This criterion can and will be enhanced 
using an improved distance involving recovered cluster flux and spatial distance. 

Purity is defined as the ratio of true detections to the total of detections: 



true detections . . 

purity = — : (30) 

total detections 
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And Completeness is denned as the ratio of true detections to the total number of clusters 
in the simulation: 

, true detections , . 

completeness = — (31) 

total number of simulated clusters 

Completeness and purity are the too main performance criteria of a detection chain. The 

higher the better! We will use them to compare detection chains in the result section. 

7. Results 

In the following, we will quantify the performance of our algorithm chain in two steps. 
We will focus on the SZ map reconstruction: first the source separation step with JADE 
and then the additionnal FDR-MultiScale Entropy Method (ME-FDR) filtering step. 
The ME-FDR filter will be compared first to the classical Gaussian and Wiener filters. 
Then we will discuss assumptions and performances of our method relative to the one 



published in IjPierpaoli et al 



2005). 

The comparison between two methods for astrophysical image reconstruction can be 
performed in several ways. One may initially forget about the astrophysical nature of the 
image, and just investigate the characteristics of residuals. This technique is typically 
used by mathematicians when dealing with picture of diverse nature. An astrophysicist 
may be more inclined to assess how well the objects of interest (SZ cluster of galaxies, in 
this case) are recovered: how many of them are found, and with what precision. Finally, 
a cosmologist would want to know to which extent the reconstruction procedure limits 
his ability to estimate cosmological parameters. This implies consideration on the purity 
and completeness as functions of cluster mass threshold, as well as precision in the re- 
construction of the (central or integrated) Compton parameter for such masses. In what 
follows, we will take the mathematician's point of view, followed by a short astrophysi- 
cist's one: we will compare the map's reconstruction error. Then we will focus on the SZ 
Cluster detection performance of the full detection chain, by comparing the recovered 
source catalogs to the Simulated SZ cluster catalog, in terms of purity and Completeness 
as defined in section 16.21 This allows a parameter free quantification of performances 
and trade-offs involved in cluster detection, being mostly independant of photometry 
details. Note also that the cluster detection efficiency does not convey information on 
the spread of the input/output y parameter. A more detailed description of the selection 
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function, contamination, a discussion on photometry issues and finally meth ods and of 
cosmological implications will be given in a following paper IjJuin et alJ l2005). 



7.1. Real life with JADE 

Figure 01 shows the recovered SZ map, after source separation. JADE performs very 
well on our maps: the SZ signal subdominant at all scales now appears clearly. No obvious 
leftovers from other astrophysical sources are seen. A few points are worth noticing. First 
JADE assumes and requires that input data have zero mean. This point is quite easy 
to meet in bolometer experiments since our bolometer cameras measure flux variations 
while the telescope scans the sky. 

Second, JADE looses calibration information while processing data. Sky components are 
separated, but calibration of each output map has to be restored. This fact might be con- 
sidered as annoying, but it is of common observationnal procedure to calibrate a survey 
on the brightest sources in the field (which can be observed in follow up experiments). 
We chose in the following to calibrate our SZ maps using the 100 brightest clusters in 
the field. We average their recovered integrated flux (SExtractor FLUX_AUTO mode), 
and scale the map to match the average integrated flux of the 100 brightest simulated 
Clusters. In the following, before computing statistical tests, all the maps have been 
normalised this way. We foresee to suppress this feature by taking into account prior 
knowledge (the optical spectral dependance of the SZ component) in future work. 

Third, it is very important to minimise noise in the observed maps before source 
separation (JADE) too. JADE was designed to run on noiseless data. Not surprisingly, it 
is quite sensitive to noise. If noise level is not minimised before JADE, then the recovered 
mixing matrix will be inaccurate, and the SZ Cluster map will be polluted by remains 
of the other astrophysical components, inducing efficiency loses. Once the mixing matrix 
has been carefully estimated, one can apply it to the unfiltered observed maps to extract 
the observed SZ cluster map, and then apply an optimised filter and cluster detection 
algorithms to extract the cluster catalog. We chose to apply, before JADE, to the four 
observed maps a simple Gaussian filter with the widths of the optical beams. 
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Figure 4. Input SZ cluster map (Upper left) map, and the 3 maps as recovered by 
Completeness and (upper right) Gaussian filtering a — 2.5arcmin 2 , bottom left, Wiener 
filtering and bottom right ME-FDR method. We chose to plot 25 deg 2 maps, to point 
out differences that do not show on 400 deg 2 maps 

7.2. Filtering methods' performances 

Additional filtering is then applied to recover smaller clusters. We now quantify per- 
formances of the ME-FDR filter relative to the simpler filtering methods (Gaussian and 
Wiener). We chose to show results obtained with maps processed by JADE running in 
wavelet space, our best method. In a first stage, we will quantify the results by comput- 
ing an error map, and its properties. Then we will compare the output of the extraction 
procedures using the three filters. 

7.2.1. SZ map reconstruction 

Figure H| shows the maps after two classical filtering methods and our FDR multiscale 
entropy method. All maps look quite the same: in nominal noise condition of such ambi- 
tious experiments as future SZ Cluster survey, only statistical tests can sort between the 
3 filtering methods. We computed the error map (filtered map, input map subtracted) 
for each filtered option. Computing the rms of the error maps divided by the rms of the 
simulated SZ map (dimensionless rms) leads to ucauss = 0.617, uwiener — 0.602 and 
&FDR = 0.570 . 

Figure 5. Dimensionless standard deviation of error maps (filtered map, input SZ map 
subtracted) at each scale of the wavelet decomposition, for the three studied filters op- 
tions. ME-FDR filtering method (red squares) is a significant improvement compared to 
simpler filtering methods (Gaussian, black dots and Wiener, blue triangles). 
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A more accurate test is to plot (see figure |5J) the dimensionless standard deviation of 
error maps (filtered map, input map subtracted) at each scale of the wavelet decompo- 
sition: 

a L = (32) 

&Sim,L 

where ctl, &Err,L and <7sim,L, are the standard deviation of the error map and the input 
SZ map, selecting the scale L of our wavelet transform. ME-FDR method performs better 
than simpler filters at all scales. Running our algorithms while changing the noise level 
showed that the higher the noise level, the larger the gain in using our ME-FDR filtering 
compared to simple Gaussian, or Wiener filters. 



Figure 6. Purity versus Completeness for the three filtering options presented in this 
paper. Completeness + ME-FDR (red squares) out performs the other filter options 
Gauss (black dots) and Wiener (blue triangles), allowing larger Completeness at any 
required level of purity. In the exemple presented here we simulated 9942 clusters ac- 
cording to ACDM distribution, with a threshold value on Ycomp of 2 10~ 12 st: note that 
the Completeness is normalized to this total number of simulated cluster. The insert is 
a zoom on the high purity region of the graph. 



7.2.2. Full extraction chain 

Then considering our goal of detecting clusters, the relevant test is to compare the 
recovered catalog to the simulated cluster catalog, input of the simulated map. Observed 
cluster catalogs are extracted and processed as explained in 16.21 In a future paper (JB. 
Juin et al.) we will present selection function, contamination, a discussion on photometry 
issues and finally constraints on cosmology. In this paper where we discuss detection 
procedures the relevant information is the curve of purity versus Completeness while 
rising threshold on cluster observed flux (figure[()J). As expected, the higher the threshold, 
the higher the purity, and the lower the Completeness. We see once more that ME-FDR 
out performs the other filter options, allowing larger Completeness at any required level 
of purity. 
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7.3. Comparison with another Wavelet based methodology 

In what follows, we will compare the method described above with the one presented in 



Pierpaoli et al 



1 2005|). In line with the spirit of the previous sections, we will compare the 
general performance of the two methods in reconstructing the SZ images and then assess a 
global purity and completeness, as defined in section lrT2l As we are not as sessing here the 



Pierpaoli et al 



goodness of the y parameter reconstruction, a full comparison with the 
1 2005) results (i.e. purity and completeness as a function of cluster mass threshold, spread 
of the input /output y parameter relation ) is not possible at this time 
We will first describe the 



Pierpaoli et al 



1 2005) method, and then preform the com- 



parison according to the mentioned criteria. 



7.3.1. Description of the alternative image reconstruction method 



The method presented in 



Pierpaoli et al 



] j20o3), 



is formally a non-blind component 



separation that has been optimized in order to recover galaxy clusters. In this method 
component separation and the deconvolution of the beam effect are done in one step by 
computing the Bayes Least Square estimate under a Gaussian scale mixture model of 
"neighborhoods" of wavelet coefficients. These "neighborhoods" are sets of wavelet coef- 
ficients which are associated with the same location and behave in a coherent manner. 
The optical frequency dependency of each component, the beam size and the noise level 
for each observation are assumed to be known. The procedure relies on the possibility to 
discriminate between the different components (CMB, clusters, Infra-Red point sources 
and Galaxy Dust) b y modeling the 



Scale Mixtures (see 



Pierpaoli et al 



o int pr obability of these neighborhoods by Gaussian 
1 2005J) for details). A Gaussian scale mixture is a 
random vector x — \fzu, where u is a Gaussian vector independent of the scalar random 
variable z - allowed to be non-Gaussian. 

The model is determined by the probability distribution p z and the covariance matrices 
of u. The covariance matrices have to be adjusted at each scale of the wavelet decompo- 
sition as a function of the resolution of the observed maps. They are computed prior to 
running the algorithm on simulated maps of each component s at the correct r esolut ion. 

1 20051) in- 



The probability distribution p z is also component dependent. 



Pierpaoli et al 



vestigate several different possibilities for p z , showing that different distributions lead to 
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very different results in the mass reconstruction. For example, the Gaussian assumption 
(Pierl in the following) is most adequate to recover the maximum number of clusters but 
is however inefficient in reconstructing the appropriate central intensity of these clusters; 
while a non-Gaussian distribution modeled on the simulated SZ map (Pierl) is most ad- 
equate to recover the input Comptonisation parameter for bright clusters while missing 
the reconstruction of very low intensity ones. 



Figure 7. Maps computed by two of the algorithms presented in the following of 25 deg 2 
size. Left is input SZ Cluster map. Centered is the map output of Completeness after 
ME-FDR filtering. Right is the map output of the Pierpaoli et al. method, using the 
Gaussian assumption for probability distribution. 



In 



Pierpaoli et al 



1 2005) the focus is on reconstructing the signal of the most massive 
clusters, since those are the ones which are likely to lead to the most precise constraints 
on cosmological parameters. To this aim, the distribution Pier2 is preferable to Pierl . 
As this paper is more focused on cluster detection efficiency (also including low-intensity 
clusters) and disregards the precision of the reconstruction, we will consider here the 
Gaussian prior for p z (Pierl in what follows). We remind the reader that a Gaussian 
prior for p z reduces the estimator to a local Wiener filtering in Wave let space, as al l 



Pierpaoli et al 



(2005) 



information about the signal non-Gaussianity is lost. The result of 
is a set of beam-deconvolved maps (one for each physical component considered) which 
can be directly compared with the input ones, or with other method's. The cluster's y 
map is the one we are using here for co mparison. As for cluste r detection, we will present 



results obtained with the code used in 
code adopted here. 



Pierpaoli et al 



I 2005), as well as the SExtractor 



Figure 8. Standard deviation versus scale for SZ cluster maps recovered by Completeness 
ME-FDR method (red squares) and Pierpaoli et al. Pierl method (blue triangles). The 
top axis reports the wavelet's typical scale (in arcmin) corresponding to the index on the 
bottom axis. 
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7.3.2. Map comparison 

In figure we present the maps that will be used for the comparison. While both 
maps recover well the intense clusters, the Pierl processed map shows more low-intensity 
structure than the ME-FDR map. This could be due to a better reconstruction of low- 
intensity clusters, as well as undesirable noise. Computing the Pierl error map rms, we 
find apieri — 0.589, which should be compared to 0.570 for the Completeness ME-FDR. 
An analysis of the residuals' normalised standard deviation for different wavelet scales 
(shown in figure |Sj shows that the two methods mainly differ at angular scales equal and 
above 14 arcmin. The major contribution to the difference in the total rms is therefore 
not associated with the typical (low mass) cluster scale, but to much larger ones. 

Figure 9. Purity versus Completeness curves for the four software detection chains stud- 
ied in the following. We use red squares for Complctcncss-FDR+SExtractor, black dots 
for Pierl and blue triangles for Pier2 with peak detection, black diamonds for Pierl using 
SExtractor for source detection. The inserted plot is a zoom on the high purity region of 
the graph. 



7.3.3. Detection efficiency 

We compare here the detection efficiency defined in section lt>.2l For consistency with 
our detection chain (see section Tfl} . we recalibrate the Pierpaoli's maps so that the av- 
erage integrated Compton Flux of the 100 brightest clusters match the value observed in 
the input SZ cluster map (small correction). Figure[!|]presents the curves of Completeness 
versus purity for Complet eness FDR-ME + SEx tractor, Pierl, Pier2 using the peak find- 



ing algorithm presented in 



Pieroaoli et al 



I 2005) to detect clusters and the Pierl method 



using SExtractor (PierNew). As expected, Pier2 is not as good as the other methods at all 
purity, since the probability p z used here is optimized for accurate recovering of the most 
massive cluster's central intensity and not for cluster detection. Pierl (the Gaussian 
distribution) improves Pier2 results, especially when SExtractor is used to detect the 
clusters (PierNew). At low threshold (low purity) PierNew recovers more clusters than 
ME-FDR. The low-intensity structures visible in figure contain a sizable number a 
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clusters. As Completeness FDR-ME is designed to filter out noise better, it also filters 
out some low-intensity clusters which then cannot contribute to the detection rate any 
longer. At high purities, PierNew and Completeness FDR-ME provide equivalent perfor- 
mance, given the statistical uncertainly of the small number of clusters involved in the 
high-threshold cut. 



8. Conclusion 

We simulated observed sky maps at frequencies of a typical large multiband bolomet- 
ric SZ-Cluster survey. We implemented a complete software detection chain, working in 
3 steps. First we use a source separation algorithm, that is based on a Wavelet trans- 
form and the JADE-ICA algorithm. Then, we filter the SZ-Cluster map using an FDR- 
Multiscale Entropy method. Finally, we detect the clusters on the filtered maps using the 
SExtractor software in a noiseless configuration. This detection chain is very efficient, 
yielding 25% of clusters of mass larger than 1O 14 M detected with 90% purity. We com- 



pare our detection algorithm to previously published wavelet based ones IjPierpaoli et al 



2005J). We restrict our comparison criteria to a global detection efficiency, as defined in 
section \Q. 21 We find a detection efficiency in the high-purity region comparable to the 
Gaussian probability case (Pierl method), which is the model that provides the bes t per- 



formance for this comparison among those presented in the IjPierpaoli et al 



2005). The 



ME-FDR detection efficiency slightly degrades at lower cluster intensity with respect to 



Pierl method, as ME-FDR filters out low-intensity clusters durin g the denoising 



proce 



dure. These results, however, are impressive as ME-FDR, unlike (jPierpaoli et alJl2005j) 
methods, is a blind algorithm that makes no assumption on the physical properties of 
the signal to be recovered. 

Our future algorithmic work will be focused on improving the source separation step. 
Physics brings, along with data, a lot of prior knowledge on sky components, and we will 
use a sensible part of this knowledge, mainly optical spectrum knowledge, in the source 
separation algorithms. This will allow to overcome the intrinsic scaling indeterminacy 
of the blind linear mixture model and so prevent loosing track of map calibration as 
in section 17.11 Additionally, JADE assumes to run on noise-free data. We will design 
our source separation algorithm to handle noisy data. Of course detection efficiency 
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doesn't provide all the information we want to know in order to do cosmology, as the 
accuracy of the reconstruction and photometry issues are also important. In our next 



papers l|Juin et al 



2005) we will use these algorithms to compute selection functions, 



contaminations and constraints on cosmology foreseen for the upcomming bolometric 
SZ-cluster surveys. 
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